Abstract. Machine learning methods are now used for most recognition problems. Convolutional Neural Networks (CNNs) have repeatedly proved successful for many image processing tasks, primarily because of their architecture. In this paper we propose applying a CNN to small data sets, such as personal albums or similar settings where the size of the training dataset is a limitation, within the framework of a proposed hybrid CNN-AIS model. We use Artificial Immune System (AIS) principles to augment the small training data set. A Clonal Selection layer is added to the local filtering and max pooling of the CNN architecture. The proposed architecture is evaluated on the standard MNIST dataset with the data size restricted, and on a small personal data sample belonging to two different classes. Experimental results show that the proposed hybrid CNN-AIS recognition engine works well when training data is limited.

1 Introduction

Today, most object recognition approaches rely on machine-learning methods, and their performance generally improves with the size of the dataset. Labeled datasets such as NORB [1], Caltech-101/256 [2][3] and CIFAR-10/100 [4], with tens of thousands of images, are now considered small, and LabelMe [5] and ImageNet [6], with millions of images, are preferred. Even a simple recognition task requires datasets on the order of tens of thousands of images [14]. It is generally assumed that objects in realistic settings show a lot of variability, so that learning to recognize them requires much larger training sets. The many shortcomings of small data sets have been widely recognized, for example by Pinto [7]. Learning thousands of objects from millions of images requires a model with a large learning capacity and powerful processing.

We present an adaptive, self-learning and self-evolving hybrid recognition engine that works well with small training sets. The model uses the information processing mechanism of the Artificial Immune System (AIS) to help a Convolutional Neural Network (CNN) generate a robust feature set, taking the small set of input training images as seeds. It performs visual pattern learning using a heterogeneous combination of a supervised CNN and the Clonal Selection (CS) principles of AIS. It can be extended to perform classification tasks with limited training data, particularly in the context of personal photo collections, where for each training sample different points of view are gathered in parallel using clonal selection. This is very different from populating datasets with artificially generated training examples [H], obtained by randomly distorting the original training images with randomly picked distortion parameters.

The specific contributions of this paper are as follows. We designed a hybrid Convolutional Neural Network-Artificial Immune System (CNN-AIS) recognition engine architecture that works with modest-sized training data; it is detailed in Section 3. The model was tested on the well-known MNIST digit database. The current best error rate of 0.3% on the MNIST digit recognition task approaches human performance [8], whereas we obtained good results with a considerably smaller number of training samples. We also applied this model to a small AIS-based classifier and successfully classified two categories from a small personal photo collection.


2 Related Work 

The idea of building a hierarchical structure of features for object detection has deep roots in the computer vision literature [3,[Tn]. The general structure of the deep convolutional neural network (CNN) was introduced in 1989 by LeCun [11]. His architecture, LeNet, is still in use today, with steady improvements to its individual components. An important idea of the CNN is that feature extraction and classification are unified in a single structure. The parameters of both the classifier and the feature detector are trained globally, supervised by backpropagation. After the last stage of feature detection, a few fully connected layers are added to perform classification. The model was proposed for handwritten digit recognition and achieved a very high success rate on the MNIST dataset [12]. However, it demands a substantial amount of labeled training data (60,000 samples for MNIST). The input is also very small (28x28) and free of background clutter, illumination changes and similar effects that are an integral part of normal pictures; most realistic vision applications do not offer such conditions. For instance, Ranzato et al m trained a large CNN for object detection on the Caltech-101 dataset but obtained poor results, even though it achieved perfect classification on the training set. This weak generalization when the number of training samples is small and the number of free parameters is large is a case of overfitting, or overparametrization. Other biologically inspired models such as HMAX m use hardwired filters and hard max functions to compute the responses in the pooling layer; the drawback is that they cannot adapt to different problem settings.

Successful algorithms have been built on top of handcrafted gradient response features such as SIFT and histograms of oriented gradients (HOG). These are fixed features and cannot adjust to model the intricacies of a problem. The success of an object recognition algorithm depends to a large extent on the features detected: they should differ as much as possible between classes while remaining invariant within a class. Traditional hand-designed feature extraction is laborious and, moreover, cannot process raw images, whereas an automatic extraction mechanism can obtain features directly. The multiple processing layers of machine learning systems extract more abstract, invariant features of the data and achieve higher classification accuracy than traditional shallower classifiers. These deep architectures have shown promising performance in image [12], language [15] and speech [T5| tasks. In [T7|, [T5| supervised classifiers such as CNNs, MLPs, SVMs and K-nearest neighbors are combined in a Mixture of Experts approach, where the outputs of parallel classifiers are used to produce the final result. CNNs, though efficient at learning invariant features from images, do not always produce optimal classification, and SVMs with their fixed kernel function are unable to learn complicated invariances. Our approach is different in that we propose a single architecture for training and testing using CNN and AIS principles.


3 Convolutional Neural Network-Artificial Immune System (CNN-AIS) Model

Our proposed architecture integrates Clonal Selection (CS) principles from Artificial Immune Systems (AIS) with Convolutional Neural Networks (CNN). We first briefly introduce AIS theory and the basic CNN structure that we use. The hybrid CNN-AIS trainable recognition engine is then presented, followed by results and an analysis of its merits.


Artificial Immune Systems (AIS): AIS use Clonal Selection and Negative Selection, imitating the biological immune system. The main task of the immune system is to defend the organism against pathogens. In the human body, B-cells with different receptor shapes try to bind to antigens. The best-fitting cells proliferate and produce clones, which mutate at very high rates. The process is repeated, and a better B-cell (a better solution) is likely to emerge. This is called Clonal Selection. The clones mutate from the original cell at a rate inversely proportional to the match strength. Two concepts are particularly relevant for our framework.

Generation of Diversity: B-cells produce antibodies for specific antigens. Each B-cell makes a specific antibody, which is expressed from the genes in its gene library. The gene library does not contain genes that define antibodies for every possible antigen. Instead, gene fragments in the library randomly combine and recombine to produce a hugely diverse range of antibodies. This helps the immune system make the precise antibody for an antigen it may never have encountered before.

Avidity: Avidity (functional affinity) is the accrued, collective strength of the multiple affinities of an antigen with various antibodies.

Based on this biological process, several Artificial Immune Systems (AIS) have been developed [22], [23]. Castro developed the Clonal Selection Algorithm (CLONALG) [24] on the basis of the Clonal Selection theory of the immune system, and it was shown to be capable of pattern recognition. The CLONALG algorithm can be described as follows:

1. Randomly initialize a population of individuals (M).
2. For each pattern in P, present it to the population M and determine its affinity with each element of M.
3. Select the n highest-affinity elements of M and generate clones of these individuals in proportion to their affinity with the antigen (the pattern from P): the higher the affinity, the larger the number of clones, and vice versa.
4. Mutate all these clones at a rate inversely proportional to their affinity with the input pattern: the higher the affinity, the smaller the mutation rate.
5. Add the mutated individuals to the population M and reselect m of these matured individuals to be kept as memories of the system.
6. Repeat steps 2 to 5 until a stopping criterion is met.
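
The CLONALG steps listed above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the implementation of [24]: individuals are real-valued vectors, the affinity function (1 / (1 + Euclidean distance)), the parameter names (pop_size, n_select, clone_factor) and the choice to keep one memory cell per pattern are all hypothetical.

```python
import numpy as np


def affinity(cells, antigen):
    """Assumed affinity in (0, 1]: 1 / (1 + Euclidean distance); 1.0 is a perfect match."""
    return 1.0 / (1.0 + np.linalg.norm(cells - antigen, axis=1))


def clonalg(patterns, pop_size=20, n_select=5, clone_factor=10.0,
            generations=20, seed=0):
    """Toy CLONALG loop; returns one memory cell per input pattern."""
    rng = np.random.default_rng(seed)
    dim = patterns.shape[1]
    # Step 1: randomly initialise the population M.
    population = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    memory = population[:len(patterns)].copy()

    for _ in range(generations):                     # Step 6: repeat steps 2 to 5
        for j, antigen in enumerate(patterns):       # Step 2: present each pattern
            aff = affinity(population, antigen)
            best = np.argsort(aff)[::-1][:n_select]  # Step 3: n best elements
            clones = []
            for idx in best:
                # Step 3: the clone count grows with affinity.
                n_clones = max(1, int(round(clone_factor * aff[idx])))
                # Step 4: the mutation scale shrinks as affinity grows.
                sigma = 1.0 - aff[idx]
                noise = rng.normal(0.0, 1.0, size=(n_clones, dim)) * sigma
                clones.append(population[idx] + noise)
            # Step 5: pool everything, keep the best cell as the memory for
            # this pattern and the pop_size best cells as the next population.
            pool = np.vstack([population, memory[j:j + 1]] + clones)
            order = np.argsort(affinity(pool, antigen))[::-1]
            memory[j] = pool[order[0]]
            population = pool[order[:pop_size]]
    return memory
```

Because the current memory cell is always part of the pool in step 5, the affinity of each memory cell to its pattern can only stay the same or improve from one generation to the next.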


Convolutional Neural Network (CNN): A Convolutional Neural Network [22] [24] is a multilayer feed-forward artificial neural network with a deep supervised learning architecture. The ordered architecture progressively learns higher-level features, with the last layer giving the classification. Two operations, convolutional filtering and down-sampling, alternate to learn features from the raw images and constitute the feature map layers. The weights are trained by backpropagation using gradient descent to minimize the training error. We use stochastic gradient descent, since it is less likely to get stuck in poor local minima, which are common given the non-linear nature of the error surface. A simplified CNN was presented in [5^, and we use it in our work instead of the rather complicated LeNet-5|2f)j. The model has five layers.


CNN-AIS Model: Modern architectures learn features across hidden layers, from low-level details up to high-level details. Our hybrid CNN-AIS model adds a layer of Artificial Immune System (AIS) based Clonal Selection (CS) to the traditional Convolutional Neural Network (CNN) structure (Fig. 1). The model is explained layer by layer.

Fig. 1. CNN-AIS Hybrid Model

Convolutional Layer: A 2D filtering between the input images $n$ and a matrix of kernels/weights $K$ produces the output $I$, where $I_k = \sum_{(i,j) \in M_k} (n_i * K_j)$ and $M$ is a table of input-output relationships. The kernel responses from the inputs connected to the same output are linearly combined. As with MLPs, a scaled hyperbolic tangent function is applied to every $I$.
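
A minimal sketch of this layer is given below, assuming that each output map is the sum of the kernel responses listed for it in the table M. The function names, the list-of-pairs layout of the table and the scaling constants 1.7159 and 2/3 (a common choice for the scaled hyperbolic tangent) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Plain 'valid' 2D convolution (kernel flipped), written with explicit loops."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * flipped)
    return out


def conv_layer(inputs, kernels, table):
    """table[k] lists the (input index i, kernel index j) pairs feeding output map k."""
    outputs = []
    for pairs in table:
        # Linearly combine the kernel responses connected to the same output.
        total = sum(conv2d_valid(inputs[i], kernels[j]) for i, j in pairs)
        # Scaled hyperbolic tangent (constants are an assumed, common choice).
        outputs.append(1.7159 * np.tanh(2.0 / 3.0 * total))
    return outputs
```

For example, with two 4x4 inputs, 3x3 kernels and table = [[(0, 0), (1, 0)], [(1, 1)]], the first output map combines both inputs filtered with kernel 0 and the second uses only input 1 with kernel 1; both maps are 2x2.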

Sub-sampling Layer: Small invariances to translation and distortion are achieved with the max-pooling operation, which also speeds up convergence and improves generalization.
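
As an illustration of how max pooling gives a small amount of translation invariance, the following sketch pools non-overlapping windows of a single feature map. The window size of 2 and the function name are assumptions; the paper does not state the pooling size it uses.

```python
import numpy as np


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h, w = feature_map.shape
    h_out, w_out = h // size, w // size
    trimmed = feature_map[:h_out * size, :w_out * size]
    # Split into (h_out, size, w_out, size) blocks and take the max of each block.
    blocks = trimmed.reshape(h_out, size, w_out, size)
    return blocks.max(axis=(1, 3))
```

A peak at position (0, 0) and the same peak shifted to (1, 1) fall into the same 2x2 window, so both produce the same pooled output.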

Fully Connected Layer-I: The input to this layer is a set of feature maps from the lower layer, which are combined into a 1-dimensional feature vector and subsequently passed through an activation function.

Clonal Selection Layer: This is the additional layer that we propose in our architecture; it is the second-to-last layer. It receives its input from Fully Connected Layer-I in the form of a 1-D feature vector for each of the n images in the current batch. Each feature vector in the feature set undergoes cloning, mutation and crossover according to the rules of Clonal Selection to generate additional features that satisfy the minimum threshold criterion and resemble the particular class. The number of clones is calculated by

$CNum = \eta \times \mathrm{affinity}(FeatureVector_1, FeatureVector_2)$   (i)

where $\eta$ is the cloning constant. The higher the affinity of the match, the greater the clone stimulus and the larger the number of clones; the lower the affinity, the smaller the number, which is consistent with the biological immune response mechanism. The mutation frequency, Rate, is calculated by

$Rate = \alpha \times \frac{1}{\mathrm{affinity}(FeatureVector_1, FeatureVector_2)}$   (ii)

where $\alpha$ is the mutation constant. According to (ii), the higher the affinity of the match, the lower the mutation frequency, and vice versa. Hence from n initial feature sets we now have (n x CNum) feature sets. These newly generated feature vectors are grouped into batches and individually fed to the output layer, and the resulting error is backpropagated to train the kernels of the CNN. Thus, from the seeds of a few representative images of each class, a bigger set is evolved using the Clonal Selection principles of the Artificial Immune System. The end of the training phase yields a trained CNN and, for each class, a set of representative features, which we call antibodies, that is much larger than the original dataset. Although we start with random values of the feature sets (antibodies) for each class, they eventually converge to their optimal values.
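
The cloning and mutation rules (i) and (ii) can be sketched as follows. Several details are assumptions made only for illustration: affinity is taken to be the cosine similarity between a feature vector and a class reference vector, mutation is additive Gaussian noise whose standard deviation equals Rate, crossover is omitted, and all function and parameter names are hypothetical.

```python
import numpy as np


def cosine_affinity(u, v, eps=1e-8):
    """Assumed affinity: cosine similarity, floored at eps so it stays positive."""
    value = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))
    return max(value, eps)


def clone_count(aff, eta):
    """Equation (i): CNum = eta * affinity, rounded to a whole number of clones."""
    return int(round(eta * aff))


def mutation_rate(aff, alpha):
    """Equation (ii): Rate = alpha / affinity."""
    return alpha / aff


def clonal_selection_layer(features, reference, eta=10.0, alpha=0.01,
                           threshold=0.5, seed=0):
    """Clone and mutate each feature vector; keep clones above the affinity threshold."""
    rng = np.random.default_rng(seed)
    kept = []
    for feature in features:
        aff = cosine_affinity(feature, reference)
        rate = mutation_rate(aff, alpha)
        for _ in range(clone_count(aff, eta)):
            # Assumed mutation: Gaussian noise with standard deviation Rate.
            clone = feature + rng.normal(0.0, rate, size=feature.shape)
            if cosine_affinity(clone, reference) >= threshold:
                kept.append(clone)
    if not kept:
        return np.empty((0, features.shape[1]))
    return np.vstack(kept)
```

With eta = 10, a feature vector with affinity 0.9 produces 9 clones mutated at a low rate, while one with affinity 0.3 produces only 3 clones mutated three times as strongly, which matches the behaviour described for (i) and (ii).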

Output Layer (Fully Connected Layer-II): This layer has one output neuron per class label and acts as a linear classifier operating on the 1-dimensional feature vector set computed by the CS layer.

4 Result 

We performed tests on the MNIST dataset. Fig. 2 plots error versus training data size for both the CNN and the CNN-AIS hybrid. When little data is available, the CNN-AIS model performs better, with error rates much lower than those of the CNN alone. Hence AIS helps train the CNN better when training data is scarce. Fig. 3 plots error versus the number of epochs for our hybrid model at the standard data size. As the number of epochs increases, the error rate decreases and becomes constant after 15 epochs.

5 Application: Personal Photo Album 

The CNN-AIS generates a robust and diverse pool of feature vectors and a trained CNN for any class. We tested this model on a personal collection of photos with two classes, Picnic (A) and Conference (N). For every test image (the antigen), the trained CNN-AIS model computes the feature vector and compares it with the feature set pool of each class. If the number of matches of the test image with the feature sets of a class, together with the combined affinities, exceeds the threshold, the test image is placed in that class. This emulates the antibodies in a human body recognizing an antigen. The model is shown in Fig. 4.

Fig. 2. Error versus training data size

Fig. 3. Plot of error rate w.r.t. no. of epochs

Fig. 4. Application: CNN-AIS Model Used for Personal Photo Classification

Fig. 5. Application Analysis on Personal Photo Album

Fig. 6. Sample images from Application dataset

The trained CNN computes the features of the test image, the antigen. A two-phase testing mechanism is used for classification. The first phase matches the test image feature against the 3N feature sets (antibodies) of each class. The number of antibodies lying above the matching threshold is counted (C) for each class. All classes reaching a minimum value of C qualify for phase 2. The second phase calculates the avidity for each class, which is the mean strength of the affinities of all qualified antibodies in C with the test image antigen. It is calculated by taking the mean of the individual matching scores (computed using an inner product measure) between the test image and each antibody above the threshold, for each qualified class. The class is finally decided on the basis of the combined score S = (Count + Avidity). The test images are different from the training images and may belong to different individuals. The model can be extended to generate a new class: if no suitable match is found during testing, the test image can initialize a new class (antibody set) with its feature set, and mutation generates the new relevant population. The experimental results are summarized in Fig. 5, and the sample dataset used for our experiments is shown in Fig. 6. Despite the diversity in the dataset and the small size of the training set, our model gives good results.
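
The two-phase test can be sketched as below. The threshold values, the dictionary layout of the antibody pools and the function name are hypothetical; only the inner-product affinity, the count C, the mean-based avidity and the score S = Count + Avidity come from the description above.

```python
import numpy as np


def classify(test_feature, antibody_pools, match_threshold=0.8, min_count=3):
    """Return (label, scores); label is None when no class qualifies."""
    scores = {}
    for label, pool in antibody_pools.items():
        # Inner-product affinity between the antigen and every antibody.
        affinities = pool @ test_feature
        matched = affinities[affinities >= match_threshold]
        count = int(matched.size)             # Phase 1: count C of matches
        if count < min_count:
            continue                          # class does not qualify for phase 2
        avidity = float(matched.mean())       # Phase 2: mean affinity of matches
        scores[label] = count + avidity       # S = Count + Avidity
    if not scores:
        # No suitable match: the test image could seed a new class.
        return None, scores
    return max(scores, key=scores.get), scores
```

A return value of None corresponds to the case in which the test image would initialize a new antibody set.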

6 Conclusion 

The AIS layer shows a marked improvement in recognition when training data is limited. The proposed model can be extended for use as a classifier for personal photo albums, as illustrated by the example application, and the results are encouraging. A new class can be added dynamically to the existing set of classes, replicating the self-learning and self-evolving behavior of the human immune system. The proposed model is able to capture the diversity inherent in personal photo collections, unlike a CNN by itself.